<!DOCTYPE html>

<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />

    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
    <meta http-equiv="x-ua-compatible" content="ie=edge">
    
    <title>3.4.3. Multi-Objective Loss Fusion &#8212; FunRec 推荐系统 0.0.1 documentation</title>

    <link rel="stylesheet" href="../../_static/material-design-lite-1.3.0/material.blue-deep_orange.min.css" type="text/css" />
    <link rel="stylesheet" href="../../_static/sphinx_materialdesign_theme.css" type="text/css" />
    <link rel="stylesheet" href="../../_static/fontawesome/all.css" type="text/css" />
    <link rel="stylesheet" href="../../_static/fonts.css" type="text/css" />
    <link rel="stylesheet" type="text/css" href="../../_static/pygments.css" />
    <link rel="stylesheet" type="text/css" href="../../_static/basic.css" />
    <link rel="stylesheet" type="text/css" href="../../_static/d2l.css" />
    <script data-url_root="../../" id="documentation_options" src="../../_static/documentation_options.js"></script>
    <script src="../../_static/jquery.js"></script>
    <script src="../../_static/underscore.js"></script>
    <script src="../../_static/_sphinx_javascript_frameworks_compat.js"></script>
    <script src="../../_static/doctools.js"></script>
    <script src="../../_static/sphinx_highlight.js"></script>
    <script src="../../_static/d2l.js"></script>
    <script async="async" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
    <link rel="index" title="Index" href="../../genindex.html" />
    <link rel="search" title="Search" href="../../search.html" />
    <link rel="next" title="3.4.4. 小结" href="4.summary.html" />
    <link rel="prev" title="3.4.2. 任务依赖建模" href="2.dependency_modeling.html" /> 
  </head>
<body>
    <div class="mdl-layout mdl-js-layout mdl-layout--fixed-header mdl-layout--fixed-drawer"><header class="mdl-layout__header mdl-layout__header--waterfall ">
    <div class="mdl-layout__header-row">
        
        <nav class="mdl-navigation breadcrumb">
            <a class="mdl-navigation__link" href="../index.html"><span class="section-number">3. </span>精排模型</a><i class="material-icons">navigate_next</i>
            <a class="mdl-navigation__link" href="index.html"><span class="section-number">3.4. </span>多目标建模</a><i class="material-icons">navigate_next</i>
            <a class="mdl-navigation__link is-active"><span class="section-number">3.4.3. </span>多目标损失融合</a>
        </nav>
        <div class="mdl-layout-spacer"></div>
        <nav class="mdl-navigation">
        
<form class="form-inline pull-sm-right" action="../../search.html" method="get">
      <div class="mdl-textfield mdl-js-textfield mdl-textfield--expandable mdl-textfield--floating-label mdl-textfield--align-right">
        <label id="quick-search-icon" class="mdl-button mdl-js-button mdl-button--icon"  for="waterfall-exp">
          <i class="material-icons">search</i>
        </label>
        <div class="mdl-textfield__expandable-holder">
          <input class="mdl-textfield__input" type="text" name="q"  id="waterfall-exp" placeholder="Search" />
          <input type="hidden" name="check_keywords" value="yes" />
          <input type="hidden" name="area" value="default" />
        </div>
      </div>
      <div class="mdl-tooltip" data-mdl-for="quick-search-icon">
      Quick search
      </div>
</form>
        
<a id="button-show-source"
    class="mdl-button mdl-js-button mdl-button--icon"
    href="../../_sources/chapter_2_ranking/4.multi_objective/3.multi_loss_optim.rst.txt" rel="nofollow">
  <i class="material-icons">code</i>
</a>
<div class="mdl-tooltip" data-mdl-for="button-show-source">
Show Source
</div>
        </nav>
    </div>
    <div class="mdl-layout__header-row header-links">
      <div class="mdl-layout-spacer"></div>
      <nav class="mdl-navigation">
          
              <a  class="mdl-navigation__link" href="https://funrec-notebooks.s3.eu-west-3.amazonaws.com/fun-rec.zip">
                  <i class="fas fa-download"></i>
                  Jupyter 记事本
              </a>
          
              <a  class="mdl-navigation__link" href="https://github.com/datawhalechina/fun-rec">
                  <i class="fab fa-github"></i>
                  GitHub
              </a>
      </nav>
    </div>
</header><header class="mdl-layout__drawer">
    
          <!-- Title -->
      <span class="mdl-layout-title">
          <a class="title" href="../../index.html">
              <span class="title-text">
                  FunRec 推荐系统
              </span>
          </a>
      </span>
    
    
      <div class="globaltoc">
        <span class="mdl-layout-title toc">Table Of Contents</span>
        
        
            
            <nav class="mdl-navigation">
                <ul>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_preface/index.html">前言</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_installation/index.html">安装</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_notation/index.html">符号</a></li>
</ul>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../../chapter_0_introduction/index.html">1. 推荐系统概述</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_0_introduction/1.intro.html">1.1. 推荐系统是什么？</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_0_introduction/2.outline.html">1.2. 本书概览</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_1_retrieval/index.html">2. 召回模型</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_1_retrieval/1.cf/index.html">2.1. 协同过滤</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/1.cf/1.itemcf.html">2.1.1. 基于物品的协同过滤</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/1.cf/2.usercf.html">2.1.2. 基于用户的协同过滤</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/1.cf/3.mf.html">2.1.3. 矩阵分解</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/1.cf/4.summary.html">2.1.4. 总结</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_1_retrieval/2.embedding/index.html">2.2. 向量召回</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/2.embedding/1.i2i.html">2.2.1. I2I召回</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/2.embedding/2.u2i.html">2.2.2. U2I召回</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/2.embedding/3.summary.html">2.2.3. 总结</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_1_retrieval/3.sequence/index.html">2.3. 序列召回</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/3.sequence/1.user_interests.html">2.3.1. 深化用户兴趣表示</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/3.sequence/2.generateive_recall.html">2.3.2. 生成式召回方法</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/3.sequence/3.summary.html">2.3.3. 总结</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1 current"><a class="reference internal" href="../index.html">3. 精排模型</a><ul class="current">
<li class="toctree-l2"><a class="reference internal" href="../1.wide_and_deep.html">3.1. 记忆与泛化</a></li>
<li class="toctree-l2"><a class="reference internal" href="../2.feature_crossing/index.html">3.2. 特征交叉</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../2.feature_crossing/1.second_order.html">3.2.1. 二阶特征交叉</a></li>
<li class="toctree-l3"><a class="reference internal" href="../2.feature_crossing/2.higher_order.html">3.2.2. 高阶特征交叉</a></li>
<li class="toctree-l3"><a class="reference internal" href="../2.feature_crossing/3.summary.html">3.2.3. 总结</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../3.sequence.html">3.3. 序列建模</a></li>
<li class="toctree-l2 current"><a class="reference internal" href="index.html">3.4. 多目标建模</a><ul class="current">
<li class="toctree-l3"><a class="reference internal" href="1.arch.html">3.4.1. 基础结构演进</a></li>
<li class="toctree-l3"><a class="reference internal" href="2.dependency_modeling.html">3.4.2. 任务依赖建模</a></li>
<li class="toctree-l3 current"><a class="current reference internal" href="#">3.4.3. 多目标损失融合</a></li>
<li class="toctree-l3"><a class="reference internal" href="4.summary.html">3.4.4. 小结</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../5.multi_scenario/index.html">3.5. 多场景建模</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../5.multi_scenario/1.multi_tower.html">3.5.1. 多塔结构</a></li>
<li class="toctree-l3"><a class="reference internal" href="../5.multi_scenario/2.dynamic_weight.html">3.5.2. 动态权重建模</a></li>
<li class="toctree-l3"><a class="reference internal" href="../5.multi_scenario/3.summary.html">3.5.3. 小结</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_3_rerank/index.html">4. 重排模型</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_3_rerank/1.greedy.html">4.1. 基于贪心的重排</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_3_rerank/2.personalized.html">4.2. 基于个性化的重排</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_3_rerank/3.summary.html">4.3. 本章小结</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_4_trends/index.html">5. 难点及热点研究</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_4_trends/1.debias.html">5.1. 模型去偏</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_4_trends/2.cold_start.html">5.2. 冷启动问题</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_4_trends/3.generative.html">5.3. 生成式推荐</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_4_trends/4.summary.html">5.4. 本章小结</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_5_projects/index.html">6. 项目实践</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/1.understanding.html">6.1. 赛题理解</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/2.baseline.html">6.2. Baseline</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/3.analysis.html">6.3. 数据分析</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/4.recall.html">6.4. 多路召回</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/5.feature_engineering.html">6.5. 特征工程</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/6.ranking.html">6.6. 排序模型</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_appendix/index.html">7. Appendix</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_appendix/word2vec.html">7.1. Word2vec</a></li>
</ul>
</li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_references/references.html">参考文献</a></li>
</ul>

            </nav>
        
        </div>
    
</header>
        <main class="mdl-layout__content" tabIndex="0">

	<script type="text/javascript" src="../../_static/sphinx_materialdesign_theme.js "></script>
    <header class="mdl-layout__drawer">
    
          <!-- Title -->
      <span class="mdl-layout-title">
          <a class="title" href="../../index.html">
              <span class="title-text">
                  FunRec 推荐系统
              </span>
          </a>
      </span>
    
    
      <div class="globaltoc">
        <span class="mdl-layout-title toc">Table Of Contents</span>
        
        
            
            <nav class="mdl-navigation">
                <ul>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_preface/index.html">前言</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_installation/index.html">安装</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_notation/index.html">符号</a></li>
</ul>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../../chapter_0_introduction/index.html">1. 推荐系统概述</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_0_introduction/1.intro.html">1.1. 推荐系统是什么？</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_0_introduction/2.outline.html">1.2. 本书概览</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_1_retrieval/index.html">2. 召回模型</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_1_retrieval/1.cf/index.html">2.1. 协同过滤</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/1.cf/1.itemcf.html">2.1.1. 基于物品的协同过滤</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/1.cf/2.usercf.html">2.1.2. 基于用户的协同过滤</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/1.cf/3.mf.html">2.1.3. 矩阵分解</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/1.cf/4.summary.html">2.1.4. 总结</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_1_retrieval/2.embedding/index.html">2.2. 向量召回</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/2.embedding/1.i2i.html">2.2.1. I2I召回</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/2.embedding/2.u2i.html">2.2.2. U2I召回</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/2.embedding/3.summary.html">2.2.3. 总结</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_1_retrieval/3.sequence/index.html">2.3. 序列召回</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/3.sequence/1.user_interests.html">2.3.1. 深化用户兴趣表示</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/3.sequence/2.generateive_recall.html">2.3.2. 生成式召回方法</a></li>
<li class="toctree-l3"><a class="reference internal" href="../../chapter_1_retrieval/3.sequence/3.summary.html">2.3.3. 总结</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1 current"><a class="reference internal" href="../index.html">3. 精排模型</a><ul class="current">
<li class="toctree-l2"><a class="reference internal" href="../1.wide_and_deep.html">3.1. 记忆与泛化</a></li>
<li class="toctree-l2"><a class="reference internal" href="../2.feature_crossing/index.html">3.2. 特征交叉</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../2.feature_crossing/1.second_order.html">3.2.1. 二阶特征交叉</a></li>
<li class="toctree-l3"><a class="reference internal" href="../2.feature_crossing/2.higher_order.html">3.2.2. 高阶特征交叉</a></li>
<li class="toctree-l3"><a class="reference internal" href="../2.feature_crossing/3.summary.html">3.2.3. 总结</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../3.sequence.html">3.3. 序列建模</a></li>
<li class="toctree-l2 current"><a class="reference internal" href="index.html">3.4. 多目标建模</a><ul class="current">
<li class="toctree-l3"><a class="reference internal" href="1.arch.html">3.4.1. 基础结构演进</a></li>
<li class="toctree-l3"><a class="reference internal" href="2.dependency_modeling.html">3.4.2. 任务依赖建模</a></li>
<li class="toctree-l3 current"><a class="current reference internal" href="#">3.4.3. 多目标损失融合</a></li>
<li class="toctree-l3"><a class="reference internal" href="4.summary.html">3.4.4. 小结</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="../5.multi_scenario/index.html">3.5. 多场景建模</a><ul>
<li class="toctree-l3"><a class="reference internal" href="../5.multi_scenario/1.multi_tower.html">3.5.1. 多塔结构</a></li>
<li class="toctree-l3"><a class="reference internal" href="../5.multi_scenario/2.dynamic_weight.html">3.5.2. 动态权重建模</a></li>
<li class="toctree-l3"><a class="reference internal" href="../5.multi_scenario/3.summary.html">3.5.3. 小结</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_3_rerank/index.html">4. 重排模型</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_3_rerank/1.greedy.html">4.1. 基于贪心的重排</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_3_rerank/2.personalized.html">4.2. 基于个性化的重排</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_3_rerank/3.summary.html">4.3. 本章小结</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_4_trends/index.html">5. 难点及热点研究</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_4_trends/1.debias.html">5.1. 模型去偏</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_4_trends/2.cold_start.html">5.2. 冷启动问题</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_4_trends/3.generative.html">5.3. 生成式推荐</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_4_trends/4.summary.html">5.4. 本章小结</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_5_projects/index.html">6. 项目实践</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/1.understanding.html">6.1. 赛题理解</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/2.baseline.html">6.2. Baseline</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/3.analysis.html">6.3. 数据分析</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/4.recall.html">6.4. 多路召回</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/5.feature_engineering.html">6.5. 特征工程</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_5_projects/6.ranking.html">6.6. 排序模型</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_appendix/index.html">7. Appendix</a><ul>
<li class="toctree-l2"><a class="reference internal" href="../../chapter_appendix/word2vec.html">7.1. Word2vec</a></li>
</ul>
</li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../chapter_references/references.html">参考文献</a></li>
</ul>

            </nav>
        
        </div>
    
</header>

    <div class="document">
        <div class="page-content" role="main">
        
  <section id="multi-loss-optim">
<span id="id1"></span><h1><span class="section-number">3.4.3. </span>多目标损失融合<a class="headerlink" href="#multi-loss-optim" title="Permalink to this heading">¶</a></h1>
<p>Multi-objective modeling usually entails the joint optimization of several losses. Methods in this family take the model architecture as fixed and focus on how to train the model and tune its parameters given the characteristics of the tasks. The simplest approach assigns each loss a weight by hand, based on business experience, and optimizes the weighted sum:</p>
<div class="math notranslate nohighlight" id="equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-0">
<span class="eqno">(3.4.15)<a class="headerlink" href="#equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-0" title="Permalink to this equation">¶</a></span>\[Loss_{total} = \sum_i w_i L_i\]</div>
<p>where <span class="math notranslate nohighlight">\(L_i\)</span> and <span class="math notranslate nohighlight">\(w_i\)</span> denote the loss of the <span class="math notranslate nohighlight">\(i\)</span>-th task and its weight, respectively.</p>
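<p>As a concrete sketch, this hand-weighted fusion is a one-line reduction. The function name, task names, and numbers below are illustrative only, not taken from the text:</p>

```python
# Minimal sketch of hand-tuned multi-loss fusion: combine each task loss
# L_i with a manually chosen weight w_i into one scalar objective.
def fuse_losses(losses, weights):
    """Return Loss_total = sum_i w_i * L_i."""
    assert len(losses) == len(weights)
    return sum(w * l for w, l in zip(weights, losses))

# e.g. a CTR loss near 0.3 and a CVR loss near 2.0, weighted by hand
total = fuse_losses(losses=[0.3, 2.0], weights=[1.0, 0.2])
# 1.0 * 0.3 + 0.2 * 2.0 = 0.7
```

<p>The choice of weights here is pure business judgment; the methods below replace it with learned or rule-driven weighting.</p>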
<p>Once the architecture is fixed, the loss-fusion strategy becomes a key factor in model performance. Hand-tuned weighting suffers from three fundamental defects:</p>
<ul class="simple">
<li><p>Magnitude imbalance: loss values differ greatly in scale across tasks (e.g. a CTR loss is typically 0.1&#8211;0.5 while a CVR loss can exceed 2.0), so the larger loss dominates the optimization direction</p></li>
<li><p>Asynchronous convergence: sparse tasks converge slowly while dense tasks converge quickly, so overfitting and underfitting coexist</p></li>
<li><p>Gradient conflict: when task gradients point in inconsistent directions they cancel each other out (e.g. the angle between the CTR and CVR gradients exceeds 90&#176;)</p></li>
</ul>
<p>The rest of this section walks through three mainstream optimization methods, covering their theoretical framing, mechanics, and use in practice.</p>
<section id="uncertainty-weight">
<h2><span class="section-number">3.4.3.1. </span>Uncertainty Weight：基于不确定性的自适应加权<a class="headerlink" href="#uncertainty-weight" title="Permalink to this heading">¶</a></h2>
<p>Uncertainty Weighted Loss (UWL)
<span id="id2">(<a class="reference internal" href="../../chapter_references/references.html#id93" title="Kendall, A., Gal, Y., &amp; Cipolla, R. (2018). Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7482–7491).">Kendall <em>et al.</em>, 2018</a>)</span>
adjusts the task weights dynamically according to each task's uncertainty: the greater a task's uncertainty, the smaller the weight it receives. The paper distinguishes two kinds of uncertainty during training: epistemic uncertainty, which stems from a lack of data, and aleatoric uncertainty, which stems from noise inherent in the data or the task itself.
Under UWL, the loss for two tasks can be written as:</p>
<div class="math notranslate nohighlight" id="equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-1">
<span class="eqno">(3.4.16)<a class="headerlink" href="#equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-1" title="Permalink to this equation">¶</a></span>\[Loss = \frac{1}{2\sigma_1^2} \mathcal{L}_1(\mathbf{W}) + \frac{1}{\sigma_2^2} \mathcal{L}_2(\mathbf{W}) + \log \sigma_1 + \log \sigma_2\]</div>
<p>Here <span class="math notranslate nohighlight">\(\sigma_i\)</span> is a learnable parameter representing the uncertainty of task <span class="math notranslate nohighlight">\(i\)</span>. When a task's loss is large while <span class="math notranslate nohighlight">\(\sigma_i\)</span> is small, the term <span class="math notranslate nohighlight">\(\frac{1}{2\sigma_i^2} \mathcal{L}_i(\mathbf{W})\)</span> becomes large, and the optimizer reduces it by increasing <span class="math notranslate nohighlight">\(\sigma_i\)</span>, which down-weights that task; the <span class="math notranslate nohighlight">\(\log \sigma_i\)</span> terms act as regularizers that keep <span class="math notranslate nohighlight">\(\sigma_i\)</span> from growing without bound. Intuitively, the model avoids taking large parameter updates in the direction of a highly uncertain task.</p>
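<p>The UWL objective can be sketched in a few lines of plain Python. This is a minimal illustration under our own naming, not the paper's implementation; in a real model each log-sigma would be a learnable parameter updated by backpropagation alongside the network weights:</p>

```python
import math

def uwl_loss(task_losses, log_sigmas):
    """Uncertainty-weighted total loss:
       sum_i  L_i / (2 * sigma_i**2) + log(sigma_i),
    where sigma_i = exp(log_sigma_i); parameterizing by log(sigma)
    keeps sigma positive without extra constraints."""
    total = 0.0
    for loss, log_sigma in zip(task_losses, log_sigmas):
        sigma_sq = math.exp(2.0 * log_sigma)           # sigma_i^2
        total += loss / (2.0 * sigma_sq) + log_sigma   # weighted loss + regularizer
    return total
```

<p>Raising a task's log-sigma shrinks its effective weight, while the log-sigma regularizer charges a price for doing so, so the optimum trades the two off per task.</p>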
</section>
<section id="gradnorm">
<h2><span class="section-number">3.4.3.2. </span>GradNorm：梯度标准化方法<a class="headerlink" href="#gradnorm" title="Permalink to this heading">¶</a></h2>
<p>In multi-task optimization, the task losses differ in magnitude, so the task with the larger loss also takes larger gradient updates and can dominate training, pulling the whole model in its direction. In addition, because the tasks' data distributions differ, their losses converge at different speeds. To account for both loss magnitude and training speed, GradNorm
<span id="id3">(<a class="reference internal" href="../../chapter_references/references.html#id94" title="Chen, Z., Badrinarayanan, V., Lee, C.-Y., &amp; Rabinovich, A. (2018). Gradnorm: gradient normalization for adaptive loss balancing in deep multitask networks. International conference on machine learning (pp. 794–803).">Chen <em>et al.</em>, 2018</a>)</span>
introduces a gradient loss alongside the ordinary task losses; this loss updates the per-task weights by gradient descent. The two losses are optimized separately rather than simply summed into a single objective.</p>
<p>Before introducing the gradient loss, we first define the gradient magnitude and the training speed of each loss.</p>
<div class="math notranslate nohighlight" id="equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-2">
<span class="eqno">(3.4.17)<a class="headerlink" href="#equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-2" title="Permalink to this equation">¶</a></span>\[G_{W}^{(i)}(t) \, = \, \|\nabla_{W} w_{i}(t) L_{i}(t)\|_{2}\]</div>
<div class="math notranslate nohighlight" id="equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-3">
<span class="eqno">(3.4.18)<a class="headerlink" href="#equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-3" title="Permalink to this equation">¶</a></span>\[\overline{G}_{W}(t) \, = \, E_{\text{task}}[G_{W}^{(i)}(t)]\]</div>
<p>Here <span class="math notranslate nohighlight">\(W\)</span> denotes the parameters shared by all tasks (typically the last shared layer), and <span class="math notranslate nohighlight">\(G_{W}^{(i)}(t)\)</span> is the norm of the gradient of task <span class="math notranslate nohighlight">\(i\)</span>'s weighted loss with respect to <span class="math notranslate nohighlight">\(W\)</span>; a large value means loss
<span class="math notranslate nohighlight">\(i\)</span> currently has a large gradient magnitude. <span class="math notranslate nohighlight">\(\overline{G}_{W}(t)\)</span> is the mean of these gradient norms over all tasks.</p>
<div class="math notranslate nohighlight" id="equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-4">
<span class="eqno">(3.4.19)<a class="headerlink" href="#equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-4" title="Permalink to this equation">¶</a></span>\[\tilde{L}_i(\tilde{t}) = L_i(t) / L_i(0)\]</div>
<div class="math notranslate nohighlight" id="equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-5">
<span class="eqno">(3.4.20)<a class="headerlink" href="#equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-5" title="Permalink to this equation">¶</a></span>\[r_i(t) = \frac{\tilde{L}_i(t)}{E_{\text{task}}[\tilde{L}_i(t)]}\]</div>
<p><span class="math notranslate nohighlight">\(L_i(t)\)</span>表示的是训练的第t时刻，任务<span class="math notranslate nohighlight">\(i\)</span>的Loss值，所以<span class="math notranslate nohighlight">\(\tilde{L}_i(\tilde{t})\)</span>表示的是任务<span class="math notranslate nohighlight">\(i\)</span>在第t时刻的相对第0时刻的损失比率，该值如果越小的话则代表该任务loss收敛的比较快，训练速度较大。<span class="math notranslate nohighlight">\(r_i(t)\)</span>则是在<span class="math notranslate nohighlight">\(L_i(t)\)</span>的基础上做了一次归一化，让所有任务之间的速率相对可以比较，同样也是值越小表示任务的训练速度越快。</p>
<p>The final gradient loss is defined as:</p>
<div class="math notranslate nohighlight" id="equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-6">
<span class="eqno">(3.4.21)<a class="headerlink" href="#equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-6" title="Permalink to this equation">¶</a></span>\[L_{\text{grad}}(t; w_i(t)) = \sum_i \left| G_W^{(i)}(t) - \overline{G}_W(t) \times [r_i(t)]^\alpha \right|_1\]</div>
<p>The gradient loss combines the gradient magnitude and training speed defined above. Intuitively, when a task's gradient norm <span class="math notranslate nohighlight">\(G_W^{(i)}(t)\)</span> is far above the target <span class="math notranslate nohighlight">\(\overline{G}_W(t) \times [r_i(t)]^\alpha\)</span>, the gradient loss grows, and minimizing it with respect to the weights pushes that task's weight down, preventing the large-gradient task from dominating training. Likewise, when a task is training quickly, i.e. <span class="math notranslate nohighlight">\(r_i(t)\)</span> is small, its target shrinks, the gradient loss grows, and its weight is reduced, keeping any one task from converging too fast. The hyperparameter <span class="math notranslate nohighlight">\(\alpha\)</span> controls the strength of this restoring force.</p>
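<p>The quantities above can be assembled into a small numerical sketch, assuming the per-task gradients of the unweighted losses with respect to the shared parameters are already available (the function name, argument names, and default alpha are our own choices for illustration):</p>

```python
import numpy as np

def gradnorm_loss(grads, weights, losses, initial_losses, alpha=1.5):
    """Gradient loss L_grad = sum_i |G_i - G_bar * r_i**alpha|.
    grads: per-task gradients g_i of the unweighted losses w.r.t. the
    shared parameters W, so G_i = ||w_i * g_i||_2.  The target
    G_bar * r_i**alpha is treated as a constant when differentiating
    with respect to the weights, as in the paper."""
    G = np.array([w * np.linalg.norm(g) for w, g in zip(weights, grads)])
    G_bar = G.mean()                                        # mean gradient norm
    L_tilde = np.array(losses) / np.array(initial_losses)   # loss ratio L~_i(t)
    r = L_tilde / L_tilde.mean()                            # relative rate r_i(t)
    return np.abs(G - G_bar * r ** alpha).sum()
```

<p>Minimizing this with respect to the weights (targets held constant) pulls each weighted gradient norm toward a common scale, modulated by how fast each task is training.</p>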
</section>
<section id="pareto-optimization">
<h2><span class="section-number">3.4.3.3. </span>Pareto Optimization：帕累托优化框架<a class="headerlink" href="#pareto-optimization" title="Permalink to this heading">¶</a></h2>
<p>In multi-task learning, when the gradient directions of different tasks are fundamentally in conflict (improving task A necessarily hurts task B), we face an optimization problem on the Pareto frontier. Weighted averaging breaks down in this setting, and a dedicated framework is needed to find the set of Pareto-optimal solutions
<span id="id4">(<a class="reference internal" href="../../chapter_references/references.html#id95" title="Lin, X., Zhen, H.-L., Li, Z., Zhang, Q.-F., &amp; Kwong, S. (2019). Pareto multi-task learning. Advances in neural information processing systems, 32.">Lin <em>et al.</em>, 2019</a>)</span>:</p>
<div class="math notranslate nohighlight" id="equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-7">
<span class="eqno">(3.4.22)<a class="headerlink" href="#equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-7" title="Permalink to this equation">¶</a></span>\[\min_{\theta} \mathbf{L}(\theta) = \min_{\theta} (\mathcal{L}_1(\theta), \mathcal{L}_2(\theta), ..., \mathcal{L}_T(\theta))\]</div>
<p>A solution is Pareto optimal if no other solution can improve any task without hurting at least one other task.</p>
<p><strong>Core idea of Pareto-optimal loss fusion:</strong></p>
<p>Combine the task losses into a weighted sum and use the KKT conditions to adjust the weights dynamically, so that the optimization direction points toward the Pareto frontier:</p>
<div class="math notranslate nohighlight" id="equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-8">
<span class="eqno">(3.4.23)<a class="headerlink" href="#equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-8" title="Permalink to this equation">¶</a></span>\[\mathcal{L}(\theta) = \sum_{i=1}^{K} w_i \mathcal{L}_i (\theta)\]</div>
<p>其中 <span class="math notranslate nohighlight">\(w_i\)</span> 为可学习的权重，满足 <span class="math notranslate nohighlight">\(\sum w_i = 1\)</span> 且
<span class="math notranslate nohighlight">\(w_i \geq c_i\)</span>（<span class="math notranslate nohighlight">\(c_i\)</span> 为权重下限）。</p>
<p><strong>Optimization procedure (two alternating steps):</strong></p>
<ol class="arabic">
<li><p>With the weights fixed, update the model parameters <span class="math notranslate nohighlight">\(\theta\)</span>: minimize the weighted loss <span class="math notranslate nohighlight">\(\mathcal{L}(\theta)\)</span> by gradient descent, i.e. the ordinary training step.</p></li>
<li><p>With the model fixed, update the weights <span class="math notranslate nohighlight">\(w_i\)</span>:</p>
<ul>
<li><p>Objective: solve for the weights <span class="math notranslate nohighlight">\(w_i\)</span> that minimize the squared 2-norm of the weighted gradient (so that the KKT conditions are satisfied):</p>
<div class="math notranslate nohighlight" id="equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-9">
<span class="eqno">(3.4.24)<a class="headerlink" href="#equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-9" title="Permalink to this equation">¶</a></span>\[\min _{w}\left\|\sum_{i=1}^{K} w_{i} \nabla_{\theta} \mathcal{L}_{i}(\theta)\right\|_{2}^{2}\]</div>
</li>
<li><p>Constraints: <span class="math notranslate nohighlight">\(\sum w_i = 1\)</span>, <span class="math notranslate nohighlight">\(w_i \geq c_i\)</span>.</p></li>
<li><p>Relaxation and projection:</p>
<ul>
<li><p>Substitute <span class="math notranslate nohighlight">\(\tilde{w}_i = w_i - c_i\)</span> to turn the inequality constraints into non-negativity constraints.</p></li>
<li><p>First drop <span class="math notranslate nohighlight">\(\tilde{w}_i \geq 0\)</span> and solve the quadratic program with only the equality constraint.</p></li>
<li><p>Project the solution <span class="math notranslate nohighlight">\(\tilde{w}^*\)</span> back onto the non-negative orthant (the projection has a fast closed-form solution):</p>
<div class="math notranslate nohighlight" id="equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-10">
<span class="eqno">(3.4.25)<a class="headerlink" href="#equation-chapter-2-ranking-4-multi-objective-3-multi-loss-optim-10" title="Permalink to this equation">¶</a></span>\[\min _{\tilde{w}}\|\tilde{w}-\tilde{w}^*\|_2^2 \quad \text{s.t.} \quad \sum \tilde{w}_i=1-\sum c_i, \tilde{w}_i \geq 0\]</div>
</li>
</ul>
</li>
</ul>
</li>
</ol>
<p>The core contribution of PE-LTR is to recast the Pareto conditions of multi-objective optimization as a quadratic program over the loss weights; by alternately updating the model parameters and the loss weights, it steers the model toward the Pareto frontier.</p>
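<p>For two tasks the quadratic program in step 2 has a closed-form solution. The sketch below solves the K = 2 min-norm problem, omitting the lower bounds c_i for simplicity, so it is a simplification of the procedure above rather than the full PE-LTR algorithm (the function name is our own):</p>

```python
import numpy as np

def min_norm_weight_2task(g1, g2):
    """Solve min_w ||w*g1 + (1-w)*g2||^2 for w in [0, 1]: the K = 2
    special case of step 2, with the weight lower bounds c_i omitted."""
    diff = g1 - g2
    denom = float(diff @ diff)
    if denom == 0.0:                     # identical gradients: any weight works
        return 0.5
    w = float((g2 - g1) @ g2) / denom    # unconstrained minimizer
    return float(np.clip(w, 0.0, 1.0))   # project onto [0, 1]

# Orthogonal task gradients: the balanced point w = 0.5 minimizes the norm,
# and the combined direction 0.5*g1 + 0.5*g2 descends on both tasks.
w = min_norm_weight_2task(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

<p>When one gradient strictly dominates (e.g. g2 is g1 scaled up), the minimizer saturates at a boundary, i.e. the combined update follows the smaller gradient alone.</p>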
</section>
</section>


        </div>
        <div class="side-doc-outline">
            <div class="side-doc-outline--content"> 
<div class="localtoc">
    <p class="caption">
      <span class="caption-text">Table Of Contents</span>
    </p>
    <ul>
<li><a class="reference internal" href="#">3.4.3. 多目标损失融合</a><ul>
<li><a class="reference internal" href="#uncertainty-weight">3.4.3.1. Uncertainty Weight：基于不确定性的自适应加权</a></li>
<li><a class="reference internal" href="#gradnorm">3.4.3.2. GradNorm：梯度标准化方法</a></li>
<li><a class="reference internal" href="#pareto-optimization">3.4.3.3. Pareto Optimization：帕累托优化框架</a></li>
</ul>
</li>
</ul>

</div>
            </div>
        </div>

      <div class="clearer"></div>
    </div><div class="pagenation">
     <a id="button-prev" href="2.dependency_modeling.html" class="mdl-button mdl-js-button mdl-js-ripple-effect mdl-button--colored" role="botton" accesskey="P">
         <i class="pagenation-arrow-L fas fa-arrow-left fa-lg"></i>
         <div class="pagenation-text">
            <span class="pagenation-direction">Previous</span>
            <div>3.4.2. 任务依赖建模</div>
         </div>
     </a>
     <a id="button-next" href="4.summary.html" class="mdl-button mdl-js-button mdl-js-ripple-effect mdl-button--colored" role="botton" accesskey="N">
         <i class="pagenation-arrow-R fas fa-arrow-right fa-lg"></i>
        <div class="pagenation-text">
            <span class="pagenation-direction">Next</span>
            <div>3.4.4. 小结</div>
        </div>
     </a>
  </div>
        
        </main>
    </div>
  </body>
</html>